183 research outputs found

Region-based and pathway-based QTL mapping using a p-value combination method

Author: Chia-Wei Chen
DV Zaykin
G Peng
H-C Yang
HC Yang
Hsin-Chou Yang
K Yu
N Siva
S Purcell
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2011

Quantitative trait locus (QTL) mapping using deep DNA sequencing data is a challenging task. In this study, we performed region-based and pathway-based QTL mapping using a p-value combination method to analyze the simulated quantitative traits Q1 and Q4 and the exome sequencing data. The aims were to evaluate the performance of the QTL mapping approaches that were used and to suggest plausible strategies for QTL mapping of DNA sequencing data. We conducted single-locus QTL mapping using a linear regression model with adjustments for age and smoking status, and we also conducted region-based and pathway-based QTL mapping using a truncated product method for combining p-values from the single-locus QTL mapping. To account for the features of rare variants and common single-nucleotide polymorphisms (SNPs), we separately considered rare-variant-only, common-SNP-only, and combined analyses. An analysis of 200 simulated replications showed that the three region-based methods reasonably controlled type I error and that the combined analysis yielded the greatest statistical power. Rare-variant-only, common-SNP-only, and combined analyses were also applied to pathway-based QTL mapping. We found that pathway-based QTL mapping had a power of approximately 100% when the significance of the vascular endothelial growth factor pathway was evaluated, but type I errors were slightly inflated. Our approach complements single-locus QTL mapping. An integrated approach using single-locus, combined region-based, and combined pathway-based analyses should yield promising results for QTL mapping of DNA sequencing data.
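
The truncated product method mentioned in this abstract combines only those single-locus p-values that fall at or below a threshold. The following is a minimal Python sketch of that idea, not the authors' implementation. The threshold `tau = 0.05`, the function names, and the Monte Carlo null are all assumptions made for illustration, and the null simulation treats the tests as independent, which linked SNPs within a region generally are not (a permutation-based null would be the usual remedy).

```python
import math
import random


def log_truncated_product(pvalues, tau=0.05):
    """Log of the product of all p-values at or below tau (0 if none qualify)."""
    return sum(math.log(p) for p in pvalues if p <= tau)


def tpm_pvalue(pvalues, tau=0.05, n_sim=2000, seed=0):
    """Monte Carlo p-value for the truncated product, ASSUMING independent tests."""
    rng = random.Random(seed)
    observed = log_truncated_product(pvalues, tau)
    n = len(pvalues)
    hits = 0
    for _ in range(n_sim):
        # Under the global null, independent p-values are Uniform(0, 1).
        # 1 - random() lies in (0, 1], which keeps log() defined.
        null_draw = [1.0 - rng.random() for _ in range(n)]
        if log_truncated_product(null_draw, tau) <= observed:
            hits += 1
    # Add-one estimator keeps the estimate strictly positive.
    return (hits + 1) / (n_sim + 1)
```

A region with several small p-values yields a small combined p-value, while a region with no p-value at or below `tau` yields 1, because the empty product is the largest value the statistic can take. Input p-values must be strictly positive for the logarithm to be defined.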

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Use of the gamma method for self-contained gene-set analysis of SNP data

Author: A Subramanian
Ann M Moyer
AR Gallant
B Efron
BL Fridley
Brooke L Fridley
DB Allison
DV Zaykin
Gregory D Jenkins
I Dinu
JJ Goeman
Joanna M Biernacka
K Wang
K Yu
L Li
LA Hindorff
Liewei Wang
LS Chen
MC Whitlock
N Niu
O De la Cruz
P Holmans
P Scheet
RA Fisher
RC Elston
SA McCarroll
WJ Gauderman
Publication venue: Nature Publishing Group

Gene-set analysis (GSA) evaluates the overall evidence of association between a phenotype and all genotyped single nucleotide polymorphisms (SNPs) in a set of genes, as opposed to testing for association between a phenotype and each SNP individually. We propose using the Gamma Method (GM) to combine gene-level P-values for assessing the significance of gene-set (GS) association. We performed simulations to compare the GM with several other self-contained GSA strategies, including both one-step and two-step GSA approaches, in a variety of scenarios. We define a ‘one-step’ GSA approach as one in which all SNPs in a GS are used to derive a test of GS association without consideration of gene-level effects, and a ‘two-step’ approach as one in which all genotyped SNPs in a gene are first used to evaluate association of the phenotype with all measured variation in the gene and then the gene-level tests of association are aggregated to assess the GS association with the phenotype. The simulations suggest that, overall, two-step methods provide higher power than one-step approaches and that combining gene-level P-values using the GM with a soft truncation threshold between 0.05 and 0.20 is a powerful approach for conducting GSA, relative to the competing approaches assessed. We also applied all of the considered GSA methods to data from a pharmacogenomic study of cisplatin, and obtained evidence suggesting that the glutathione metabolism GS is associated with cisplatin drug response.
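
The second step of a two-step approach, combining gene-level P-values into one gene-set P-value, can be illustrated with Fisher's method. This is a hedged sketch rather than the GM as used in the paper: it assumes the commonly described formulation in which the GM sums inverse gamma CDF transforms of 1 − p, so that a shape parameter of 1 gives the transform −ln p and the GM coincides with Fisher's method. Soft truncation thresholds of 0.05 to 0.20 would correspond to other shape values under that formulation and need a gamma quantile function, which is not implemented here. The function name is hypothetical, and gene-level tests are assumed independent.

```python
import math


def fisher_combine(pvalues):
    """Fisher's combined p-value for independent tests (closed form, even df)."""
    k = len(pvalues)
    # half_stat equals X / 2, where X = -2 * sum(log p) ~ chi-square(2k) under the null.
    half_stat = -sum(math.log(p) for p in pvalues)
    # Chi-square survival function with 2k df: exp(-x/2) * sum_{j<k} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total
```

With a single gene the combined value is just that gene's P-value, which is a convenient sanity check on the closed form.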

Crossref

PubMed Central

Haplotype Estimation from Fuzzy Genotypes Using Penalized Likelihood

Author: A Dempster
AM Mehta
D Clayton
DV Zaykin
H Akaike
H Kang
Hae-Won Uh
HW Uh
J Marchini
JC Long
KL Ayers
L Excoffer
M Stephens
ME Hawley
Paul H. C. Eilers
R Thompson
R van Berloo
S Lin
SL Slager
T Niu
Thomas Mailund
TJ Hastie
ZS Qin
Publication venue: Public Library of Science
Publication date: 01/01/2010

The Composite Link Model is a generalization of the generalized linear model in which expected values of observed counts are constructed as a sum of generalized linear components. When combined with penalized likelihood, it provides a powerful and elegant way to estimate haplotype probabilities from observed genotypes. Uncertain (“fuzzy”) genotypes, like those resulting from AFLP scores, can be handled by adding an extra layer to the model. We describe the model and the estimation algorithm. We apply it to a data set of accurate human single nucleotide polymorphisms (SNPs) and to a data set of fuzzy tomato AFLP scores.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

EUR Research Repository

Leiden University Scholary Publications

Erasmus University Digital Repository

Multiethnic Genetic Association Studies Improve Power for Locus Discovery

Author: A Keinan
AD Skol
AG Clark
AL Price
Benjamin F. Voight
DV Zaykin
E Zeggini
H Unoki
I Pe'er
JN Hirschhorn
K Yasuda
LJ Scott
MI McCarthy
Michael Nicholas Weedon
N Zaitlen
NA Rosenberg
Paul I. W. de Bakker
PI Lin
R Saxena
R Sladek
Sara L. Pulit
TA Manolio
V Steinthorsdottir
YY Teo
Publication venue: Public Library of Science
Publication date: 01/09/2010

To date, genome-wide association studies have focused almost exclusively on populations of European ancestry. These studies continue with the advent of next-generation sequencing, designed to systematically catalog and test low-frequency variation for a role in disease. A complementary approach would be to focus further efforts on cohorts of multiple ethnicities. This leverages the idea that population genetic drift may have elevated some variants to higher allele frequency in different populations, boosting statistical power to detect an association. Based on empirical allele frequency distributions from eleven populations represented in HapMap Phase 3 and the 1000 Genomes Project, we simulate a range of genetic models to quantify the power of association studies in multiple ethnicities relative to studies that exclusively focus on samples of European ancestry. In each of these simulations, a first phase of GWAS in exclusively European samples is followed by a second GWAS phase in any of the other populations (including a multiethnic design). We find that nontrivial power gains can be achieved by conducting future whole-genome studies in worldwide populations. In particular, African populations contribute the largest relative power gains for low-frequency alleles (<5%) of moderate effect that suffer from low power in samples of European descent. Our results emphasize the importance of broadening genetic studies to worldwide populations to ensure efficient discovery of genetic loci contributing to phenotypic trait variability, especially for those traits for which large numbers of samples of European ancestry have already been collected and tested.

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

Directory of Open Access Journals

PubMed Central

Is FKBP5 a genetic marker of affective psychosis? A case control study and analysis of disease related traits

Author: AW Zobel
B Pfuhlmann
BS Weir
Burkhard Jabs
CR Sinars
DV Zaykin
EB Binder
EFC van Rossum
F Holsboer
Gerald Stoeber
GM Wochnik
H Vermeer
HS Akiskal
JA Magee
JE Schwartz
JM Westberry
K Leonhard
KC Koenen
Kerstin Moller-Ehrlich
Meinhard Mende
Micha Gawlik
Michael Jovnerovski
Michael Knapp
MT Tsuang
R Cheng
Sven Jung
T Becker
T Matsubara
World Health Organization (WHO)
Publication venue: BioMed Central
Publication date: 01/11/2006

BACKGROUND: A dysregulation of the hypothalamic-pituitary-adrenal (HPA) axis has been proposed as an important pathogenic factor in depression. Genetic variants of FKBP5, a protein of the HPA system modulating the glucocorticoid receptor, have been reported to be genetically associated with improved response to medical treatment and an increase of depressive episodes. METHODS: We examined three single nucleotide polymorphisms (SNPs) in FKBP5, rs4713916 in the proposed promoter region, rs1360780 in the second intron and rs3800373 in the 3'-untranslated region (3'-UTR), in a case-control study of Caucasian origin (affective psychosis: n = 248; controls: n = 188) for genetic association and association with disease related traits. RESULTS: Allele and genotype frequencies of rs4713916, rs1360780 and rs3800373 were not significantly different between cases and controls. Two three-locus haplotypes, G-C-T and A-T-G, accounted for 86.2% in controls. Odds ratios were not increased between cases and controls, except for the rare haplotype G-C-G (OR 6.81), representing 2.1% of cases and 0.3% of controls. The frequency of rs4713916AG in patients deviated from expected Hardy-Weinberg equilibrium. The genotype AA at rs4713916 in monopolar depression (P = 0.011) and the two-locus haplotype rs1360780T – rs3800373T in the total sample (overall P = 0.045) were nominally associated with longer continuance of disease. CONCLUSION: Our data do not support a significant genetic contribution of FKBP5 polymorphisms and haplotypes to affective psychosis, and the findings are inconclusive regarding their contribution to disease-related traits.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Assessing Significance in High-Throughput Experiments by Sequential Goodness of Fit and q-Value Estimation

Author: A Carvajal-Rodriguez
A Farcomeni
Antonio Carvajal-Rodriguez
AP Diz
B Efron
C Dalmasso
DV Zaykin
Ioannis P. Androulakis
J de Uña-Alvarez
J Storey
Jacobo de Uña-Alvarez
JD Storey
KI Kim
N Meinshausen
SB Pounds
W Barry
WH Press
Y Benjamini
Publication venue: Public Library of Science
Publication date: 01/01/2011

We developed a new multiple hypothesis testing adjustment called SGoF+, implemented as a sequential goodness of fit metatest. It is a modification of a previous algorithm, SGoF, that uses the information in the distribution of p-values to fix the rejection region. The new method uses a discriminant rule based on the maximum distance between the uniform distribution of p-values and the observed one to set the null for a binomial test. This new approach shows a better power/pFDR ratio than SGoF. In fact, SGoF+ automatically sets the threshold leading to the maximum power and the minimum false non-discovery rate inside the SGoF' family of algorithms. Additionally, we suggest combining the information provided by SGoF+ with the estimate of the FDR that has been committed when rejecting a given set of nulls. We study different positive false discovery rate (pFDR) estimation methods to combine q-value estimates jointly with the information provided by the SGoF+ method. Simulations suggest that the combination of the SGoF+ metatest with the q-value information is an interesting strategy to deal with multiple testing issues. These techniques are provided in the latest version of the SGoF+ software freely available at http://webs.uvigo.es/acraaj/SGoF.htm
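
The binomial test at the core of a goodness-of-fit metatest can be illustrated as follows. This is a generic sketch, not the SGoF+ algorithm and not the SGoF+ software: it only asks whether the number of p-values at or below a threshold exceeds what the complete null (uniform p-values) would produce. The threshold `gamma = 0.05` and both function names are assumptions, and the discriminant rule the abstract describes for choosing the threshold is not reproduced.

```python
import math


def binomial_upper_tail(k, n, gamma):
    """Exact P(X >= k) for X ~ Binomial(n, gamma); suitable for modest n."""
    return sum(
        math.comb(n, i) * gamma**i * (1.0 - gamma) ** (n - i)
        for i in range(k, n + 1)
    )


def excess_small_pvalues(pvalues, gamma=0.05):
    """Count p-values <= gamma and test that count against the complete null."""
    n = len(pvalues)
    observed = sum(1 for p in pvalues if p <= gamma)
    # Under the complete null each p-value falls at or below gamma with probability gamma.
    return observed, binomial_upper_tail(observed, n, gamma)
```

For example, four of ten p-values at or below 0.05 is far more than the 0.5 expected under the complete null, so the upper-tail probability is small.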

CiteSeerX

Public Library of Science (PLOS)

Investigo

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Directory of Open Access Journals

PubMed Central

Common ataxia telangiectasia mutated haplotypes and risk of breast cancer: a nested case–control study

Author: CI Szabo
CM West
D Cortez
David J Hunter
DO Stram
Donna Spiegelman
DSN Fallin
DV Zaykin
G Chenevix-Trench
Graham A Colditz
HMKL Inskip
JBG Chen
JL Bernstein
K Khanna
KAF Spring
L Izatt
M Meyn
MGBJ FitzGerald
MMD Swift
NAN Janin
P Bretsky
P Sebastiani
PESM Bonnen
Peter Kraft
R Letrero
Rulla M Tamimi
S Angèle
S Li
SN Teraoka
Susan E Hankinson
T Dork
YR Thorstenson
YRSP Thorstenson
Publication venue: BioMed Central
Publication date: 01/01/2004

INTRODUCTION: The ataxia telangiectasia mutated (ATM) gene is a tumor suppressor gene with functions in cell cycle arrest, apoptosis, and repair of DNA double-strand breaks. Based on family studies, women heterozygous for mutations in the ATM gene are reported to have a fourfold to fivefold increased risk of breast cancer compared with noncarriers of the mutations, although not all studies have confirmed this association. Haplotype analysis has been suggested as an efficient method for investigating the role of common variation in the ATM gene and breast cancer. Five biallelic haplotype tagging single nucleotide polymorphisms are estimated to capture 99% of the haplotype diversity in Caucasian populations. METHODS: We conducted a nested case–control study of breast cancer within the Nurses' Health Study cohort to address the role of common ATM haplotypes and breast cancer. Cases and controls were genotyped for five haplotype tagging single nucleotide polymorphisms. Haplotypes were predicted for 1309 cases and 1761 controls for which genotype information was available. RESULTS: Six unique haplotypes were predicted in this study, five of which occur at a frequency of 5% or greater. The overall distribution of haplotypes was not significantly different between cases and controls (χ² = 3.43, five degrees of freedom, P = 0.63). CONCLUSION: There was no evidence that common haplotypes of ATM are associated with breast cancer risk. Extensive single nucleotide polymorphism detection using the entire genomic sequence of ATM will be necessary to rule out less common variation in ATM and sporadic breast cancer risk.
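
The reported test statistic can be checked against its P-value with a short calculation. Six haplotypes compared across two groups give (6 − 1) × (2 − 1) = 5 degrees of freedom, which matches the abstract. The sketch below computes the chi-square upper-tail probability for integer degrees of freedom using standard closed forms and only the Python standard library; the function name is hypothetical and this is an illustration, not the authors' analysis code.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df = 2m + 1:
    # erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_{k=1}^{m} x^(k - 1/2) / (2k - 1)!!
    term = math.sqrt(x)  # k = 1 term
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total
```

Calling `chi2_sf(3.43, 5)` gives a value close to 0.63, in agreement with the P-value quoted in the RESULTS.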

Crossref

ScholarWorks@UMass Amherst

Harvard University - DASH

Springer - Publisher Connector

PubMed Central

An Open Access Database of Genome-wide Association Results

Author: A Brazma
AD Johnson
Andrew D Johnson
BL Browning
BR Zeeberg
C Dong
Christopher J O'Donnell
CJ Willer
D Curtis
DM Kraus
DV Zaykin
E Hamano
E Zeggini
GK Chen
H Stoiber
J Fellay
KA Frazer
KJ Gaulton
KM Brown
LA Cupples
LL Field
M Fedetz
MD Mailman
QR Liu
R Saxena
R Thibault
S Knapp
SA Mousa
SF Saccone
SJ Chanock
TA Manolio
TG Lesnick
WTCCC consortium
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: The number of genome-wide association studies (GWAS) is growing rapidly, leading to the discovery and replication of many new disease loci. Combining results from multiple GWAS datasets may potentially strengthen previous conclusions and suggest new disease loci, pathways or pleiotropic genes. However, no database or centralized resource currently exists that contains anywhere near the full scope of GWAS results. METHODS: We collected available results from 118 GWAS articles into a database of 56,411 significant SNP-phenotype associations and accompanying information, making this database freely available here. In doing so, we met and describe here a number of challenges to creating an open access database of GWAS results. Through preliminary analyses and characterization of available GWAS, we demonstrate the potential to gain new insights by querying a database across GWAS. RESULTS: Using a genomic bin-based density analysis to search for highly associated regions of the genome, positive control loci (e.g., MHC loci) were detected with high sensitivity. Likewise, an analysis of highly repeated SNPs across GWAS identified replicated loci (e.g., APOE, LPL). At the same time we identified novel, highly suggestive loci for a variety of traits that did not meet genome-wide significant thresholds in prior analyses, in some cases with strong support from the primary medical genetics literature (SLC16A7, CSMD1, OAS1), suggesting these genes merit further study. Additional adjustment for linkage disequilibrium within most regions with a high density of GWAS associations did not materially alter our findings. Having a centralized database with standardized gene annotation also allowed us to examine the representation of functional gene categories (gene ontologies) containing one or more associations among top GWAS results. Genes relating to cell adhesion functions were highly over-represented among significant associations (p < 4.6 × 10⁻¹⁴), a finding which was not perturbed by a sensitivity analysis. CONCLUSION: We provide access to a full gene-annotated GWAS database which could be used for further querying, analyses or integration with other genomic information. We make a number of general observations. Of reported associated SNPs, 40% lie within the boundaries of a RefSeq gene and 68% are within 60 kb of one, indicating a bias toward gene-centricity in the findings. We found considerable heterogeneity in information available from GWAS, suggesting the wider community could benefit from standardization and centralization of results reporting.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Integrated genomics and proteomics define huntingtin CAG length-dependent networks in mice

Author: A Dobin
A Hodges
A Kuhn
A Pal
A Reiner
A Valencia
Andreas Tebbe
B Zhang
BM Bolstad
C Zuccato
CA Ross
Christoph Schaab
Daniel J Lavery
David Howland
DI Shirasaki
Doxa Chatzopoulou
DV Zaykin
EH Aylward
Eliana Marisa Ramos
ES Deneris
ES Lein
Fuying Gao
G Fishell
GA Smith
Giovanni Coppola
HT Orr
I Al-Ramahi
IS Seong
Ismael Al-Ramahi
J Cox
J Labbadia
JA Miller
JC Jacobsen
JE Phillips
Jeffrey P Cantle
Jeffrey S Aaronson
JF Gusella
Jim Rosinski
JM Dowen
JM Van Raamsdonk
JP Vonsattel
Juan Botas
K Becanovic
K Monahan
K Sharma
Karla El-Zein
LB Menalled
LS Kaltenbach
M Biagioli
M Heiman
M MACDONALD
M Mann
M Mielcarek
MA Pouladi
MC Oldham
MI Love
MK Lobo
MT Lin
N Wang
Nan Wang
P Giles
P Grange
P Langfelder
Peter Langfelder
PF Durrenberger
RC Gentleman
S Anders
S Horvath
S Toyoda
Sandeep Deverasetty
Seung Kwak
Steve Horvath
T Geiger
T Lu
WE Johnson
WV Chen
X Gu
X William Yang
XH Lu
Xiao-Hong Lu
Y Benjamini
Y Guo
Y Shen
Y Wang
Yining Zhao
Publication venue: eScholarship, University of California
Publication date: 01/04/2016

To gain insight into how mutant huntingtin (mHtt) CAG repeat length modifies Huntington's disease (HD) pathogenesis, we profiled mRNA in over 600 brain and peripheral tissue samples from HD knock-in mice with increasing CAG repeat lengths. We found repeat length-dependent transcriptional signatures to be prominent in the striatum, less so in cortex, and minimal in the liver. Coexpression network analyses revealed 13 striatal and 5 cortical modules that correlated highly with CAG length and age, and that were preserved in HD models and sometimes in patients. Top striatal modules implicated mHtt CAG length and age in graded impairment in the expression of identity genes for striatal medium spiny neurons and in dysregulation of cyclic AMP signaling, cell death and protocadherin genes. We used proteomics to confirm 790 genes and 5 striatal modules with CAG length-dependent dysregulation at the protein level, and validated 22 striatal module genes as modifiers of mHtt toxicities in vivo.

Crossref

eScholarship - University of California

Germline polymorphisms in SIPA1 are associated with metastasis and other indicators of poor prognosis in breast cancer

Author: A Ewart-Toland
A Jemal
A Ziogas
Argyrios Ziogas
CL Carter
CT Chung
CT Guy
D Ishida
D Stoppa-Lyonnet
DA Largaespada
David J Peel
DV Zaykin
EL Goode
H Kurachi
Hoda Anton-Culver
James Hess
Kent W Hunter
KW Hunter
L Liu
L Su
M Greco
M Shimonaka
ME Robson
N Platet
N Tsukamoto
Nigel PS Crawford
NP Crawford
RS Houlston
T Lifsted
V Guarneri
V Yajnik
WD Foulkes
Y Ohba
YG Park
Publication venue: BioMed Central
Publication date: 01/01/2006

INTRODUCTION: There is growing evidence that heritable genetic variation modulates metastatic efficiency. Our previous work using a mouse mammary tumor model has shown that metastatic efficiency is modulated by the GTPase-activating protein encoded by Sipa1 ('signal-induced proliferation-associated gene 1'). The aim of this study was to determine whether single nucleotide polymorphisms (SNPs) within the human SIPA1 gene are associated with metastasis and other disease characteristics in breast cancer. METHOD: The study population (n = 300) consisted of randomly selected non-Hispanic Caucasian breast cancer patients identified from a larger population-based series. Genomic DNA was extracted from peripheral leukocytes. Three previously described SNPs within SIPA1 (one within the promoter [-313G>A] and two exonic [545C>T and 2760G>A]) were characterized using SNP-specific PCR. RESULTS: The variant 2760G>A and the -313G>A allele were associated with lymph node involvement (P = 0.0062 and P = 0.0083, respectively), and the variant 545C>T was associated with estrogen receptor negative tumors (P = 0.0012) and with progesterone receptor negative tumors (P = 0.0339). Associations were identified between haplotypes defined by the three SNPs and disease progression. Haplotype 3 defined by variants -313G>A and 2760G>A was associated with positive lymph node involvement (P = 0.0051), and haplotype 4 defined by variant 545C>T was associated with estrogen receptor and progesterone receptor negative status (P = 0.0053 and P = 0.0199, respectively). CONCLUSION: Our findings imply that SIPA1 germline polymorphisms are associated with aggressive disease behavior in the cohort examined. If these results hold true in other populations, then knowledge of SIPA1 SNP genotypes could potentially enhance current staging protocols.

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California